Summary


Twenty subjects from the University of Southern California
performed a Bayesian inference task in pairs. As in earlier
research on inference, the individuals were asked to infer posterior
odds about a pair of hypotheses from a collection of data. Unlike
the earlier studies, the individuals were then required to aggregate
their posterior odds with those of another individual who had
seen a second set of independent data samples to form an opinion
about the same pair of hypotheses.

Conservatism  and  radicalism  findings  of  earlier  studies  were 
reconfirmed.  Individual  subjects'  responses  collected  before 
aggregation  showed  conservatism  in  the  high  d'  condition  and 
radicalism  in  the  low  d'  condition. 

The  aggregated  final  odds  from  the  pairs  of  subjects  seem  to 
reflect  some  confusion.  Some  of  the  subjects  apparently  used  a 
simple  and  incorrect  averaging  strategy.  Others  did  not  use  this 
strategy,  but  in  general,  pairs  of  subjects  were  unable  to  provide 
anything  but  conservative  final  odds  when  they  aggregated  their 
two  opinions. 

The importance of using real stimuli, the way the responses
were elicited, and the instructions that were given to the subjects
are discussed. Also, a "mean of means" or arithmetic log likelihood
response mode is discussed as an alternative elicitation
mode that may be useful in information aggregation when more than
one person is involved.
INTRODUCTION


The study reported here extends Bayesian inference to the
case in which two people rather than one must interpret information
and reach a group conclusion. Each of the two people has
conditionally independent information relevant to which of the competing
hypotheses is true. The information of one subject is different
from that of the other subject, and even the combination of data
available to both cannot provide certainty. One of the two
subjects has more diagnostic information than the other. No conflict
exists over what is known. The two must combine their information
to make a joint inference about the probabilities of the two
hypotheses.

Studies of inference in the one person case have been summarized
in Edwards (1968), Anderson (1971) and Slovic and Lichtenstein (1971).
Generally, people are conservative as individuals in their
judgments. They fail to combine the information to modify prior
probabilities as strongly as the data would justify (Wheeler and Edwards,
1975).

Eils, Seaver and Edwards (1977) showed that subjects avoid
conservatism if asked to assess an average impact of information,
rather than cumulating information over successive data. A
cumulative response requires a response number outside the range of
the inputs, while an average requires a response somewhere in the
middle of that range. Apparently, averaging is far easier than
cumulating.

This experiment explores a kind of information aggregation
problem for which cumulating is essential. It provides one set
of data to one subject and another set, of different diagnosticity,
to another subject, and then requires a joint assessment of
certainty based on both sets of data. No averaging response mode
that makes easy intuitive sense can be designed for this task.


I. FORMAL RULES


Bayes' Theorem, the formula for calculating the exact probability
of one hypothesis as compared with another given n
data and a prior odds, is:

$$\Omega_n = \Omega_0 \prod_{i=1}^{n} L_i \qquad (1)$$

where $\Omega_0$ is the prior odds in favor of the hypothesis A
over its alternative hypothesis not-A, $L_i$ is the likelihood
ratio for the ith datum, and $\Omega_n$ is the revised posterior odds.

With log transformation, this formula becomes:

$$\log \Omega_n = \log \Omega_0 + \sum_{i=1}^{n} \log L_i \qquad (2)$$


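If the subjects are able to provide arithmetic mean log likelihood
ratios (AMLL), the posterior odds will be:

$$\log \Omega_n = \log \Omega_0 + n \cdot \mathrm{AMLL}, \qquad \mathrm{AMLL} = \frac{1}{n} \sum_{i=1}^{n} \log L_i$$

Here, the final posterior odds ratio is calculated after the
subject gives his AMLL estimates.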
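The relations among the product form, the log form, and the AMLL form can be checked numerically. The following is a minimal sketch, not part of the original study; the function names, the use of base-10 logarithms, and the example likelihood ratios are all assumptions made for illustration.

```python
import math

def posterior_odds(prior_odds, likelihood_ratios):
    """Equation (1): prior odds times the product of the likelihood ratios."""
    return prior_odds * math.prod(likelihood_ratios)

def log_posterior_odds(prior_odds, likelihood_ratios):
    """Equation (2): log prior odds plus the sum of the log likelihood ratios."""
    return math.log10(prior_odds) + sum(math.log10(lr) for lr in likelihood_ratios)

def amll(likelihood_ratios):
    """Arithmetic mean log likelihood ratio (base 10 is an assumption)."""
    logs = [math.log10(lr) for lr in likelihood_ratios]
    return sum(logs) / len(logs)

# Hypothetical data: three observations favoring A at 3:1, one against at 1:3.
prior = 1.0
lrs = [3.0, 3.0, 1.0 / 3.0, 3.0]

odds_product = posterior_odds(prior, lrs)
odds_from_logs = 10 ** log_posterior_odds(prior, lrs)
odds_from_amll = 10 ** (math.log10(prior) + len(lrs) * amll(lrs))

print(odds_product, odds_from_logs, odds_from_amll)
```

All three routes give the same posterior odds (up to rounding), which is why the AMLL must be multiplied by the number of data before it can stand in for a cumulative quantity.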

When each of two people has independent information relevant
to a comparison between two hypotheses, the aggregation of that
information should be similar to the case in which the same person
has two pieces of independent information. If the two people
express their feelings based on their own information in likelihood
ratios, then they should combine that information using Equation
(1), assuming that the information available to each is
conditionally independent of that available to the other. If their
assessments are expressed as posterior odds (or some quantity from which
posterior odds can be inferred), the arithmetic is slightly more
complicated. First, a prior odds or log prior odds must be known
for each; in experiments, this is usually supplied in instructions.
Then, either the aggregate likelihood ratio, its logarithm, or the
AMLL should be recovered separately for each, using Equation 1, 2,
or 3 as appropriate.

Finally, the two assessments should be combined. One way to do
so would be to perform the appropriate aggregation for one subject
and then use his posterior odds as the prior odds for the other
subject. (Since the two subjects receive independent information,
this can be done in either order.) Another way is to use Equation
1 or 2 or 3, as appropriate, to obtain an aggregate likelihood
ratio for each subject, representing the total aggregate impact of
the data that the subject has seen. (Note that this aggregate
cannot be an AMLL; it must be a cumulative rather than a mean quantity.)
Then the two measures of aggregate impact for the two subjects
can be combined, by adding them together if they are in logarithmic
form or by multiplying them together if they are in non-log
form. Finally, the output of this combination process across
subjects can be combined with the original prior odds.
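The two combination routes described above can be sketched as follows. This is an illustrative sketch only; the function names and the example posterior odds are hypothetical, and base-10 logarithms are assumed.

```python
import math

def implied_likelihood_ratio(posterior_odds, prior_odds):
    """Recover a subject's aggregate likelihood ratio from his posterior odds."""
    return posterior_odds / prior_odds

def combine_sequentially(prior_odds, lr_first, lr_second):
    """Route 1: one subject's posterior odds become the other's prior odds."""
    intermediate_odds = prior_odds * lr_first
    return intermediate_odds * lr_second

def combine_via_logs(prior_odds, lr_a, lr_b):
    """Route 2: add the two aggregate log likelihood ratios, then apply the prior."""
    total_log_lr = math.log10(lr_a) + math.log10(lr_b)
    return prior_odds * 10 ** total_log_lr

# Hypothetical assessments: both subjects start from even prior odds.
prior = 1.0
lr_a = implied_likelihood_ratio(posterior_odds=9.0, prior_odds=prior)
lr_b = implied_likelihood_ratio(posterior_odds=2.25, prior_odds=prior)

route_1 = combine_sequentially(prior, lr_a, lr_b)
route_1_reversed = combine_sequentially(prior, lr_b, lr_a)
route_2 = combine_via_logs(prior, lr_a, lr_b)

print(route_1, route_1_reversed, route_2)
```

Both routes, and both orders of the sequential route, yield joint posterior odds of 20.25 in this example, provided the two sets of data are conditionally independent.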

II.  METHODS 

Subjects 

Twenty  undergraduate  psychology  students  at  the  University  of 
Southern  California  voluntarily  participated  as  subjects. 


All  of  the  subjects  received  generous  participation  credit  in 
their  introductory  psychology  class. 


Procedure 

Subjects participated in pairs. The experimenter told each
pair that he had prepared samples from two book bags, each filled
with 1,000 poker chips. One of the two book bags contained a majority
of blue chips while the other was predominantly red. Each subject,
working alone, then completed a response booklet. The first
page explained the task. Each subsequent page presented a sample
(with replacement) from a book bag; the subjects understood that
each successive sample represented a new selection of which bag
was being sampled. Every such selection of a bag resulted from a
flip of a fair coin.

One of the two subjects in each pair had samples drawn from
book bags that had either 750 red chips and 250 blue ones or 250
red chips and 750 blue ones. These data were relatively highly
diagnostic; d' = 1.15. The book bags from which the other person's
samples were drawn had a 600 to 400 ratio of red chips to blue ones
or the reverse. This person's samples were, therefore, relatively
less diagnostic given the same number of chips and sample
composition (d' = .41). Subjects were informed of the ratio (750:250 or
600:400) from which their samples came. The actual samples the
individuals saw are reprinted in Table 1.

The likelihood ratios in Table 1 are easier to verify if one
exploits a useful property of symmetric binomial examples like this
one. (Symmetric binomial simply means that the two hypotheses are
equidistant from 0.5 in opposite directions.) In such cases only,
the likelihood ratio for any set of observations is given by the
equation

$$L = \left(\frac{p}{q}\right)^{s - f} \qquad (5)$$

The symbols p and q refer to the probability of the more and
of the less common chip in the bag, respectively. So p/q = 3
for the high diagnosticity bag and 1.5 for the low diagnosticity
bag. The symbols s and f stand for successes and failures
(time-honored terms from statistical applications of binomial arithmetic);


Table 1. Actual 20 Samples for the 10 Pairs of Subjects

          High Diagnosticity            Low Diagnosticity
          ----------------------------  ----------------------------
                       True Likelihood               True Likelihood
Sample    Sample       Ratio            Sample       Ratio
------    ---------    ---------------  ---------    ---------------
   1      BRBR           1.00   (N)     BBBR          3.375  (B)
   2      RRBRB          3.00   (R)     R[illegible]RRRRR  11.39  (R)
   3      RBBRBBBB      81.00   (B)     BRRBB         1.50   (B)
   4      RRBRR         27.00   (R)     RRBB          1.00   (N)
   5      RRRRR        243.00   (R)     BRBRRRR       3.375  (R)
   6      RRRR          81.00   (R)     RRBRRB        2.25   (R)
   7      RRRB           9.00   (R)     BBRR          1.00   (N)
   8      BRBRBBB       27.00   (B)     BBBBB         7.59   (B)
   9      RBBRRR         9.00   (R)     RRBRRB        2.25   (R)
  10      BRBRR          3.00   (R)     RBRB          1.00   (N)
  11      RBRRB          3.00   (B)     RRRBBB        1.00   (N)
  12      BBBBRBBB     729.00   (B)     RBRBRBB       1.50   (B)
  13      BRBRRB         1.00   (N)     RRRR          5.06   (R)
  14      RRRBRRBR      81.00   (R)     RRRRBBRR      5.06   (R)
  15      BRRBR          3.00   (R)     RBBBBRRR      1.00   (N)
  16      RRRR          81.00   (R)     BBBRRB        2.25   (B)
  17      BBBBBB       729.00   (B)     RRBBRBB       1.50   (B)
  18      BBRBB         27.00   (B)     BRBRB         1.50   (B)
  19      RBBRB          3.00   (B)     RBRRRR        5.06   (R)
  20      BRRRRBRR      81.00   (R)     RRBRBBRR      2.25   (R)

Note: The symbol R means that a Red chip was drawn; B means that a Blue
chip was drawn. Order is irrelevant. The high diagnosticity samples
come from the 750-250 bag; the low diagnosticity samples come from
the 600-400 bag. (R) following a likelihood ratio means that it
favors the predominantly Red bag; (B) means that it favors the
predominantly Blue bag, and (N) means that the sample is neutral, i.e.
it contains equal numbers of chips of each color.
in this instance, a success is an occurrence of the event more
common in the sample, while a failure is an occurrence of the
less common event in the sample. Obviously a sample with more
reds than blues favors the predominantly red bag and a sample
with more blues than reds favors the predominantly blue bag.
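The symmetric binomial shortcut of Equation (5) is easy to apply mechanically. The sketch below is illustrative only; the function name and the string encoding of samples ("R" and "B" characters) are assumptions, and the three samples checked are taken from Table 1.

```python
def symmetric_binomial_lr(sample, p):
    """Likelihood ratio (p/q)**(s - f) for a sample of 'R' and 'B' chips.

    p is the proportion of the predominant color in either bag, q = 1 - p.
    Returns the ratio and the favored bag: 'R', 'B', or 'N' (neutral).
    """
    reds = sample.count("R")
    blues = sample.count("B")
    successes = max(reds, blues)   # chips of the color more common in the sample
    failures = min(reds, blues)    # chips of the color less common in the sample
    q = 1.0 - p
    ratio = (p / q) ** (successes - failures)
    if reds == blues:
        favored = "N"
    elif reds > blues:
        favored = "R"
    else:
        favored = "B"
    return ratio, favored

# Three entries from Table 1.
high_3 = symmetric_binomial_lr("RBBRBBBB", p=0.75)   # sample 3, 750-250 bag
low_13 = symmetric_binomial_lr("RRRR", p=0.60)       # sample 13, 600-400 bag
high_1 = symmetric_binomial_lr("BRBR", p=0.75)       # sample 1, neutral

print(high_3, low_13, high_1)
```

These calls reproduce the tabled values of 81.00 (B), 5.06 (R) after rounding, and 1.00 (N).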

After each subject individually had recorded his responses
to all 20 samples, the two subjects in each pair were brought
together. The experimenter explained that each sample shown
to one subject corresponded to the same numbered sample shown
to the other subject, in the sense that both had come from
book bags having the same predominant color.

This  information,  given  to  ideal  Bayesians,  would  imply  that 
they  could  combine  the  information  in  each  pair  of  samples  just 
by  multiplying  the  two  likelihood  ratios  together.  The  resulting 
product,  multiplied  by  the  prior  odds,  would  yield  an  appropriate 
joint  posterior  odds. 

For  example,  if  a subject  judged  that  sample  number  two  was 
twice  as  likely  to  have  come  from  the  predominantly  red  bag,  as 
from  the  predominantly  blue  bag,  and  his  partner  judged  a likelihood 
ratio  in  favor  of  the  same  bag  of  25:1,  the  group  likelihood  ratio 
should  have  been  50:1  in  favor  of  that  hypothesis,  just  as  if  only 
one  individual  had  obtained  both  pieces  of  information. 
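The arithmetic of this example, and the conversion from odds to the probability that the pair was asked to report, can be sketched as follows. The function names are hypothetical; the even prior odds follow from the fair coin flip described in the procedure.

```python
def group_posterior_odds(prior_odds, lr_one, lr_two):
    """Multiply two conditionally independent likelihood ratios into the prior."""
    return prior_odds * lr_one * lr_two

def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis to its probability."""
    return odds / (1.0 + odds)

# One subject judges 2:1, his partner 25:1, both in favor of the red bag.
joint_odds = group_posterior_odds(prior_odds=1.0, lr_one=2.0, lr_two=25.0)
joint_probability = odds_to_probability(joint_odds)

print(joint_odds, round(joint_probability, 3))
```

With even prior odds, the group likelihood ratio of 50:1 is also the joint posterior odds, corresponding to a probability of 50/51, or about .98.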

The  experimenter  then  seated  both  subjects  at  a table,  each 
with  his  previously  filled  in  response  booklet  in  front  of  him. 

Yet  another  response  booklet  was  provided,  and  the  experimenter 
asked  the  pair  to  reach  an  agreed-on  assessment  of  the  probability 
that  the  predominant  color  of  the  book  bags  represented  by  the 
pair  of  samples  was  (say)  red.  In  arriving  at  this  agreed-on 
assessment,  neither  subject  was  permitted  to  report  the  sample 
he  had  based  his  judgments  on  to  the  other  subject.  But  they  were 
permitted  to  report  their  assessments  of  likelihood  ratios  based 
on  each  sample.  They  were  instructed  to  consider  the  two  samples 
conditionally  independent,  and  to  consider  their  individually 
assessed  likelihood  ratios  as  correct,  for  the  purpose  of  reaching 
a group  assessment. 

III.  RESULTS 

Four  different  sets  of  numbers  can  be  compared  with  one  another 
for  each  pair  of  subjects  and  each  sample.  They  are: 

A.  Individual  responses  to  each  sample. 
B. The group's actual response to each sample.

C.  A number  calculated  by  multiplying  together  the  likelihood 
ratios  estimated  by  each  of  the  two  members  of  the  group 
for  the  particular  sample. 

D. A correct Bayesian solution, obtained by using Bayes's
Theorem for each sample. In the individual response
comparisons, this produces the appropriate likelihood ratio.
In the group comparisons, it is also necessary to multiply
together the likelihood ratios thus obtained for each
member of a pair.

These numbers were compared with one another via regression
analyses, keeping individual subjects or pairs of subjects distinct
but aggregating over samples. As might be expected of untrained
subjects performing an abstract task with little motivation to do
it carefully, the data are highly variable. Nevertheless, certain
patterns emerge from them that can be most easily displayed by
looking at medians of various kinds. Table 2 presents such medians,
spelling out which set of numbers specified above is being correlated
with or regressed on what other set.


TABLE 2. Median Correlations and Regressions

Variables being related                     r     r²    Intercept   Slope
---------------------------------------    ---    ---   ---------   -----
Individual responses and correct           .59    .35      .69       .31
Bayesian solution, high diagnosticity
subjects only (A with D)

Individual responses and correct           .76    .58      .21      1.49
Bayesian solution, low diagnosticity
subjects only (A with D)

Actual group responses and products        .79    .61      .16       .45
of individually estimated likelihood
ratios (B with C)

Actual group responses and correct         .54    .29      .48       .23
Bayesian solution (B with D)

Products of individually estimated         .53    .28      .79       .50
likelihood ratios and correct
Bayesian solution (C with D)

Note: In calculating regressions, the first quantity listed is the predicted
variable and the second is the x-axis variable. All calculations are
based on the logs of the specified quantities, expressed as numbers
The median correlations show modest to medium relations
between the variables, as is to be expected from highly variable
data. (Note that they would have been much higher if they had
not been folded at 0.) The slopes for individual responses
show a familiar pattern, standard for virtually every experiment
of this kind that has ever been performed. For high diagnosticity
data, the subjects were conservative (regression slope less than
1). For low diagnosticity data, the subjects were excessive. Other
examples of the same finding include Edwards (1968), and Wheeler
and Edwards (1975).

The fact that actual group responses correlate fairly highly
with products of individually estimated likelihood ratios means no
more than that the two members of each group were allowed to tell
each other what their individual estimates had been, and were
instructed to take those estimates as veridical for the purpose of
arriving at group assessments. But the low slope shows that they
did not arrive at these final assessments by the normatively
appropriate procedure. Direct observation showed that some pairs,
but not all, arrived at final estimates by simply averaging their
individual estimates, an easy, incorrect approach. To check how
effective a theory simple averaging might be for explaining the
group assessments, we also calculated regressions between actual
group responses (y) and the mean of the logs, rather than the sum
of the logs, of the individual responses (x). The median slope for
that regression is .89. Similarly, we calculated the same regression
between actual group responses and correct Bayesian numbers; that
median is .47. These slopes are too low to permit the
conclusion that subjects systematically averaged their estimates. We
are left with the conclusion that the subjects found the task
confusing, and adopted ill-specified and confused strategies for
determining their responses.
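To see what slopes a pure averaging strategy would produce in regressions of this kind, consider the following sketch. It is not a reanalysis of the experiment: the pairs of individual likelihood ratios are hypothetical, the least-squares helper is written out by hand, and base-10 logarithms are assumed. The simulated "group" simply reports the mean of the two log likelihood ratios.

```python
import math

def least_squares(xs, ys):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical individual likelihood-ratio estimates for five samples.
pairs = [(2.0, 1.5), (9.0, 2.25), (27.0, 1.5), (81.0, 5.0), (3.0, 1.0)]

sum_of_logs = [math.log10(a) + math.log10(b) for a, b in pairs]  # normative rule
mean_of_logs = [total / 2.0 for total in sum_of_logs]            # averaging rule

# A group that averages logs, regressed on the normative sum of logs.
slope_vs_sum, intercept_vs_sum = least_squares(sum_of_logs, mean_of_logs)
# The same group, regressed on the mean of logs.
slope_vs_mean, intercept_vs_mean = least_squares(mean_of_logs, mean_of_logs)

print(slope_vs_sum, slope_vs_mean)
```

Under these assumptions, perfect averagers would show a slope of .5 against the product of the individual estimates and a slope of 1 against the mean of the logs, with zero intercepts in both cases. These figures can be set beside the observed median slopes of .45 and .89 reported above.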

The y-intercepts of the regression lines are interesting numbers.
For virtually all individual and group responses, they are positive.
This fact, combined with the regression slopes uniformly less than
1, implies that the regression lines cross the normatively correct
45° line. Why? Inspection of the individual scatterplots, while
emphasizing the disorderliness of the data, suggests that this is not
a statistical artifact resulting from an attempt to fit a straight
line to non-linear data, as some versions of the response bias
theory of conservatism (see Edwards, 1968) might suggest.


An interesting point, not previously discussed in the man-vs.-Bayes
literature, is that response modes, like data analyses, can
be folded or unfolded. A folded response mode, like that used in
this study, requires the subject first to specify the favored
hypothesis, and thereafter to specify some appropriate number saying
how strongly the evidence favors it. An unfolded response mode
does not first require commitment to either hypothesis. An interesting
hypothesis, consistent with the findings of initial overconfidence
by Tversky and Kahneman (1974), is that the initial task of specifying
the favored hypothesis drives up the assessment of how likely the
hypothesis is. Thereafter, revisions of opinion based on new data
proceed conservatively. This idea would be easy to test, but has
not been tested.

IV. CONCLUSIONS


The main function of this experiment was to explore whether
subjects could properly aggregate numbers representing individual
degrees of certainty in order to obtain a group number representing
the result of combining information inside several heads. They could
not, with the response modes used in this study. Instead, they
attempted to reach some sort of compromise based on their individual
responses.

The study reconfirmed old findings concerning the relation
between diagnosticity and conservatism: high diagnosticity leads to
conservatism, whereas low diagnosticity leads to radicalism.

The variability and generally poor quality of the data show the
importance of careful instruction, motivation and feedback, and the
desirability of using real stimuli (e.g., actual book bags and poker
chips) rather than a paper-and-pencil task designed to have the
appropriate formal characteristics. This point has also been made
before; see Slovic, Lichtenstein and Edwards, 1965.


It would be premature to conclude from these data that subjects
cannot aggregate evidence properly. It would depend on the response
mode. Eils, Seaver, and Edwards (1977) found that if subjects were
asked to estimate arithmetic mean log likelihood ratios (Equation
4) they could do a quite good job of aggregating normally distributed
evidence; it is likely that the same holds true for stimuli in
symmetric binomial experiments like this one. It is natural to extend
that response mode to a situation in which evidence must be
aggregated in several heads rather than one. If the subjects in this
experiment had been asked to estimate mean log likelihood ratios,
then they could have done an excellent job of assessing the mean
of a group of their individual means based on multi-chip samples.
Simple arithmetic, performed on those means of means, combined with
knowledge of the number of data on which each was based, would then
give an approximation to an appropriate aggregate likelihood ratio.
How good the approximation is depends on how far the data deviate
from a 1:1 likelihood ratio. Formally, the appropriate calculation
would be for each subject separately to multiply his mean log
likelihood ratio by the number of observations on which it was based,
and then for the pair of subjects to add these products together.
The result would be an appropriate aggregate log likelihood ratio;
added to the log prior odds, it would produce the correct log
posterior odds. If the subjects instead average their mean log
likelihood ratios, and either they or the experimenter then multiply
this mean of means by the total number of observations that the two
subjects together had made, a too-high approximation will result.
How much too high it will be depends on how much each individual
subject's mean log likelihood ratio differs from 0. The farther
away both are, the nearer the approximation will be to the correct
number.
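The difference between the formally appropriate calculation and the mean-of-means shortcut can be made concrete. The sketch below is illustrative only; the function names and the numerical values (mean log likelihood ratios and observation counts) are hypothetical.

```python
def aggregate_log_lr(mean_log_lrs, counts):
    """Appropriate rule: weight each subject's mean by his number of observations."""
    return sum(mean * n for mean, n in zip(mean_log_lrs, counts))

def mean_of_means_approximation(mean_log_lrs, counts):
    """Shortcut: average the subjects' means, then multiply by the total count."""
    mean_of_means = sum(mean_log_lrs) / len(mean_log_lrs)
    return mean_of_means * sum(counts)

# Hypothetical pair: one subject reports a mean log likelihood ratio of 0.3
# over 4 chips, the other a mean of 0.1 over 8 chips.
means = [0.3, 0.1]
counts = [4, 8]

exact = aggregate_log_lr(means, counts)
approx = mean_of_means_approximation(means, counts)

# With equal sample sizes the two calculations coincide.
exact_equal = aggregate_log_lr(means, [6, 6])
approx_equal = mean_of_means_approximation(means, [6, 6])

print(exact, approx, exact_equal, approx_equal)
```

In this illustrative case the appropriate aggregate log likelihood ratio is 2.0, while the mean-of-means shortcut gives 2.4; when both subjects see the same number of observations, the two calculations agree.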


